OCR exploration server

Warning: this site is under development.
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Automatic Feature Extraction and Text Recognition From Scanned Topographic Maps

Internal identifier: 000573 (Main/Exploration); previous: 000572; next: 000574

Authors: Aria Pezeshk [United States]; Richard L. Tutwiler [United States]

Source: IEEE transactions on geoscience and remote sensing, 2011 (ISSN 0196-2892)

RBID : Pascal:12-0077019

French descriptors

English descriptors

Abstract

A system for automatic extraction of various feature layers and recognition of the text content of scanned topographic maps is presented here. Linear features which are often intersecting with the text are first extracted using a novel line representation method and a set of directional morphological operations. Other graphical objects are then removed in several stages to obtain a text-only image. A custom defect model is subsequently used to create an artificial training set for a Hidden Markov Model-based character recognition engine. Finally, the recovered text is recognized using this multifont segmentation-free optical character recognition (OCR). Extensive testing is conducted to assess the performance of different stages of the proposed system. Furthermore, our custom OCR is shown to achieve a 94% recognition rate for the extracted text, thereby outperforming a commercial OCR used as a benchmark.
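
The abstract mentions directional morphological operations for pulling linear features out of the map before text recognition. The sketch below is only an illustration of that general idea, not the authors' method: it assumes a binary image stored as a list of 0/1 rows, uses plain horizontal and vertical line structuring elements of a hypothetical length `k`, and the function names (`open_rows`, `extract_lines`, `remove_lines`) are invented for this example. Opening a binary image with a 1 x k line keeps exactly the runs of foreground pixels that are at least k long in that direction.

```python
def open_rows(img, k):
    """Morphological opening of a binary image with a 1 x k horizontal line.

    A foreground pixel survives only if it lies in a horizontal run of
    at least k consecutive foreground pixels.
    """
    out = [[0] * len(row) for row in img]
    for r, row in enumerate(img):
        start = None
        for c, v in enumerate(list(row) + [0]):  # sentinel 0 closes a trailing run
            if v and start is None:
                start = c
            elif not v and start is not None:
                if c - start >= k:
                    for j in range(start, c):
                        out[r][j] = 1
                start = None
    return out


def transpose(img):
    """Swap rows and columns so the row-wise opening can act on columns."""
    return [list(col) for col in zip(*img)]


def extract_lines(img, k):
    """Union of the horizontal and vertical openings (the 'line' layer)."""
    horiz = open_rows(img, k)
    vert = transpose(open_rows(transpose(img), k))
    return [[h | v for h, v in zip(hr, vr)] for hr, vr in zip(horiz, vert)]


def remove_lines(img, k):
    """Subtract the line layer from the image, leaving the short strokes."""
    lines = extract_lines(img, k)
    return [[p & (1 - l) for p, l in zip(pr, lr)] for pr, lr in zip(img, lines)]


# Toy example: one long horizontal line crossing a few short strokes.
img = [
    [0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 1, 0, 0, 1, 0],
    [0, 1, 1, 0, 0, 0],
]
for row in remove_lines(img, 4):
    print(row)
```

In this toy image the long line in the second row is removed, together with the pixels where the short strokes crossed it. That loss at intersections is the kind of difficulty the abstract alludes to when it says linear features often intersect the text; a real system would need more than this two-direction sketch to handle it.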


Affiliations:



The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en" level="a">Automatic Feature Extraction and Text Recognition From Scanned Topographic Maps</title>
<author>
<name sortKey="Pezeshk, Aria" sort="Pezeshk, Aria" uniqKey="Pezeshk A" first="Aria" last="Pezeshk">Aria Pezeshk</name>
<affiliation wicri:level="4">
<inist:fA14 i1="01">
<s1>Applied Research Laboratory, The Pennsylvania State University</s1>
<s2>University Park, PA 16804</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName>
<region type="state">Pennsylvanie</region>
<settlement type="city">University Park (Pennsylvanie)</settlement>
</placeName>
<orgName type="university">Université d'État de Pennsylvanie</orgName>
</affiliation>
</author>
<author>
<name sortKey="Tutwiler, Richard L" sort="Tutwiler, Richard L" uniqKey="Tutwiler R" first="Richard L." last="Tutwiler">Richard L. Tutwiler</name>
<affiliation wicri:level="4">
<inist:fA14 i1="01">
<s1>Applied Research Laboratory, The Pennsylvania State University</s1>
<s2>University Park, PA 16804</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName>
<region type="state">Pennsylvanie</region>
<settlement type="city">University Park (Pennsylvanie)</settlement>
</placeName>
<orgName type="university">Université d'État de Pennsylvanie</orgName>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">INIST</idno>
<idno type="inist">12-0077019</idno>
<date when="2011">2011</date>
<idno type="stanalyst">PASCAL 12-0077019 INIST</idno>
<idno type="RBID">Pascal:12-0077019</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000107</idno>
<idno type="wicri:Area/PascalFrancis/Curation">000665</idno>
<idno type="wicri:Area/PascalFrancis/Checkpoint">000127</idno>
<idno type="wicri:doubleKey">0196-2892:2011:Pezeshk A:automatic:feature:extraction</idno>
<idno type="wicri:Area/Main/Merge">000579</idno>
<idno type="wicri:Area/Main/Curation">000573</idno>
<idno type="wicri:Area/Main/Exploration">000573</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a">Automatic Feature Extraction and Text Recognition From Scanned Topographic Maps</title>
<author>
<name sortKey="Pezeshk, Aria" sort="Pezeshk, Aria" uniqKey="Pezeshk A" first="Aria" last="Pezeshk">Aria Pezeshk</name>
<affiliation wicri:level="4">
<inist:fA14 i1="01">
<s1>Applied Research Laboratory, The Pennsylvania State University</s1>
<s2>University Park, PA 16804</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName>
<region type="state">Pennsylvanie</region>
<settlement type="city">University Park (Pennsylvanie)</settlement>
</placeName>
<orgName type="university">Université d'État de Pennsylvanie</orgName>
</affiliation>
</author>
<author>
<name sortKey="Tutwiler, Richard L" sort="Tutwiler, Richard L" uniqKey="Tutwiler R" first="Richard L." last="Tutwiler">Richard L. Tutwiler</name>
<affiliation wicri:level="4">
<inist:fA14 i1="01">
<s1>Applied Research Laboratory, The Pennsylvania State University</s1>
<s2>University Park, PA 16804</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName>
<region type="state">Pennsylvanie</region>
<settlement type="city">University Park (Pennsylvanie)</settlement>
</placeName>
<orgName type="university">Université d'État de Pennsylvanie</orgName>
</affiliation>
</author>
</analytic>
<series>
<title level="j" type="main">IEEE transactions on geoscience and remote sensing</title>
<title level="j" type="abbreviated">IEEE trans. geosci. remote sens.</title>
<idno type="ISSN">0196-2892</idno>
<imprint>
<date when="2011">2011</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt>
<title level="j" type="main">IEEE transactions on geoscience and remote sensing</title>
<title level="j" type="abbreviated">IEEE trans. geosci. remote sens.</title>
<idno type="ISSN">0196-2892</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>extraction</term>
<term>models</term>
<term>morphology</term>
<term>performances</term>
<term>segmentation</term>
<term>testing</term>
<term>topographic maps</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr">
<term>Extraction</term>
<term>Carte topographique</term>
<term>Modèle</term>
<term>Segmentation</term>
<term>Expérimentation</term>
<term>Performance</term>
<term>Morphologie</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">A system for automatic extraction of various feature layers and recognition of the text content of scanned topographic maps is presented here. Linear features which are often intersecting with the text are first extracted using a novel line representation method and a set of directional morphological operations. Other graphical objects are then removed in several stages to obtain a text-only image. A custom defect model is subsequently used to create an artificial training set for a Hidden Markov Model-based character recognition engine. Finally, the recovered text is recognized using this multifont segmentation-free optical character recognition (OCR). Extensive testing is conducted to assess the performance of different stages of the proposed system. Furthermore, our custom OCR is shown to achieve a 94% recognition rate for the extracted text, thereby outperforming a commercial OCR used as a benchmark.</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>États-Unis</li>
</country>
<region>
<li>Pennsylvanie</li>
</region>
<settlement>
<li>University Park (Pennsylvanie)</li>
</settlement>
<orgName>
<li>Université d'État de Pennsylvanie</li>
</orgName>
</list>
<tree>
<country name="États-Unis">
<region name="Pennsylvanie">
<name sortKey="Pezeshk, Aria" sort="Pezeshk, Aria" uniqKey="Pezeshk A" first="Aria" last="Pezeshk">Aria Pezeshk</name>
</region>
<name sortKey="Tutwiler, Richard L" sort="Tutwiler, Richard L" uniqKey="Tutwiler R" first="Richard L." last="Tutwiler">Richard L. Tutwiler</name>
</country>
</tree>
</affiliations>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000573 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000573 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     Pascal:12-0077019
   |texte=   Automatic Feature Extraction and Text Recognition From Scanned Topographic Maps
}}

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024